Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ci): save cache on build-main workflows and only restore cache on docker-build-and-push-main workflows #4865

Merged
merged 10 commits into from
Jun 14, 2024

Conversation

youtalk
Copy link
Member

@youtalk youtalk commented Jun 13, 2024

Description

The reproducible-containers/buildkit-cache-dance action introduced in #4854, unfortunately, might not support extracting the cache from the docker buildx bake and only supports docker build.

I confirmed that replacing docker/bake-action with docker/build-and-push-action as per the official sample code does indeed enable the cache to work.
https://github.com/reproducible-containers/buildkit-cache-dance?tab=readme-ov-file#examples

Therefore, in this PR, we will create a cache by running docker/build-and-push-action in the build-main and build-main-self-hosted workflows, which are scheduled to run daily, and in the sequentially executed docker-build-and-push-main and docker-build-and-push-main-self-hosted workflows, we will run docker/bake-action as in the past and only load the cache.

It turns out that the cache was effective in #4854 because I forgot to delete the cache from the commit during my experiment, which caused the cache to be effective. I apologize for the oversight.

Tests performed

Using this commit to temporarily run the build-main workflows in this PR, we confirmed that the execution time from the second run onwards was drastically reduced from the previous 2.5 hours to 1 hour.

https://autowarefoundation.github.io/autoware-ci-metrics/

Screenshot 2024-06-14 at 8 49 33

Effects on system behavior

Not applicable.

Interface changes

Pre-review checklist for the PR author

The PR author must check the checkboxes below when creating the PR.

In-review checklist for the PR reviewers

The PR reviewers must check the checkboxes below before approval.

Post-review checklist for the PR author

The PR author must check the checkboxes below before merging.

  • There are no open discussions or they are tracked via tickets.

After all checkboxes are checked, anyone who has write access can merge the PR.

youtalk added 6 commits June 13, 2024 11:11
Signed-off-by: Yutaka Kondo <[email protected]>
Signed-off-by: Yutaka Kondo <[email protected]>
Signed-off-by: Yutaka Kondo <[email protected]>
Signed-off-by: Yutaka Kondo <[email protected]>
Signed-off-by: Yutaka Kondo <[email protected]>
Signed-off-by: Yutaka Kondo <[email protected]>
@youtalk youtalk requested a review from oguzkaganozt as a code owner June 13, 2024 06:52
@youtalk youtalk marked this pull request as draft June 13, 2024 06:52
@youtalk youtalk self-assigned this Jun 13, 2024
@youtalk
Copy link
Member Author

youtalk commented Jun 13, 2024

The workflow namings are unclear, so I would like to refactor and rename them in another PR.
ref. #4747

youtalk added 4 commits June 13, 2024 21:39
Signed-off-by: Yutaka Kondo <[email protected]>
This reverts commit adced0d.
@youtalk
Copy link
Member Author

youtalk commented Jun 14, 2024

Unfortunately, this PR still has an issue where the docker-build-and-push-main workflows cannot restore the cache, even though a fully matching cache is already available. However, since this issue can be resolved in a separate PR, we would like to consider this PR is ready for review for now.

Cache not found for input keys: cache-amd64-cuda-0187c3f0f9a23c6232cfe50f211e9deee563e665bcd440c1d921e03bb67898cc, cache-amd64-cuda-, cache-amd64-

https://github.com/autowarefoundation/autoware/actions/runs/9500467708/job/26183604143#step:6:13

Screenshot 2024-06-14 at 9 16 55

https://github.com/autowarefoundation/autoware/actions/caches

@youtalk youtalk marked this pull request as ready for review June 14, 2024 00:22
@youtalk youtalk requested review from xmfcx and mitsudome-r June 14, 2024 00:23
@youtalk youtalk added the type:ci Continuous Integration (CI) processes and testing. label Jun 14, 2024
@youtalk youtalk merged commit 7738b5a into main Jun 14, 2024
19 checks passed
@youtalk youtalk deleted the upstream-docker-build-action branch June 14, 2024 05:57
@xmfcx
Copy link
Contributor

xmfcx commented Jun 16, 2024

Unfortunately, this PR still has an issue where the docker-build-and-push-main workflows cannot restore the cache, even though a fully matching cache is already available. However, since this issue can be resolved in a separate PR, we would like to consider this PR is ready for review for now.

This is happens because the workflow trigger of schedule isolates the cache.
Only scheduled build-main can get the cache of another scheduled build-main job, nothing else.
I guess not even workflow_dispatch event can reach it.

See: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache

For this to work as intended, docker-build-and-push-main should maintain its own cache.

Also

  • instead of: cache-${{ inputs.platform }}-${{ inputs.name }}-${{ hashFiles('autoware.repos') }}
  • I propose: ccache-build-main-${{ runner.arch }}-humble-${{ github.sha }}

Reasons:

  • We don't need ${{ inputs.platform }} because ${{ runner.arch }} gives the similar result by using more generic variables.
    • Possible values are X86, X64, ARM, or ARM64.
  • ${{ inputs.name }} is cuda or no-cuda
    • We should only generate the cache for cuda version. Because no-cuda packages are a subset of cuda packages.
    • no-cuda version can restore and use the same cache.
  • Strictly separate actions/cache/save@v4 and actions/cache/restore@v4 actions instead of just using actions/cache@v4 action.
    • This way we know who and when updates the cache and who just uses it.

The reason for -${{ github.sha }} is to know where the cache was created from.

restore-keys: |
  ccache-build-main-${{ runner.arch }}-humble-

Will fetch the most fresh version anyways. As long as it is unique, it will be enough.

@youtalk -san, I know you are busy this week and I would like to implement all I've mentioned until wednesday, is that OK?

@youtalk
Copy link
Member Author

youtalk commented Jun 17, 2024

@xmfcx It's my pleasure! Please go ahead.

pravinkmr26 pushed a commit to pravinkmr26/autoware that referenced this pull request Jul 15, 2024
… on `docker-build-and-push-main` workflows (autowarefoundation#4865)

* fix ccache_dir

Signed-off-by: Yutaka Kondo <[email protected]>

* update build-main

Signed-off-by: Yutaka Kondo <[email protected]>

* restore only

Signed-off-by: Yutaka Kondo <[email protected]>

* reorder

Signed-off-by: Yutaka Kondo <[email protected]>

* MUST REVERT

Signed-off-by: Yutaka Kondo <[email protected]>

* add cache-tag-suffix arg

Signed-off-by: Yutaka Kondo <[email protected]>

* rename

Signed-off-by: Yutaka Kondo <[email protected]>

* Revert "MUST REVERT"

This reverts commit adced0d.

* Revert "Revert "MUST REVERT""

This reverts commit 7004dbe.

* Revert "Revert "Revert "MUST REVERT"""

This reverts commit e6d62e4.

---------

Signed-off-by: Yutaka Kondo <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:ci Continuous Integration (CI) processes and testing.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants